Preknowledge-based generalized association rules mining

Authors

  • Yin-Fu Huang
  • Chieh-Ming Wu
Abstract

This paper addresses the mining of generalized association rules using pruning techniques. Given a large transaction database and a hierarchical taxonomy tree of the items, we attempt to find the association rules between items at different levels of the taxonomy tree, under the assumption that the original frequent itemsets and association rules have already been generated. The primary challenge in designing an efficient mining algorithm is how to use the original frequent itemsets and association rules to generate new generalized association rules directly, rather than re-scanning the database. The proposed algorithms, GMAR (Generalized Mining Association Rules) and GMFI (Generalized Mining Frequent Itemsets), use join methods and/or pruning techniques to generate new generalized association rules. Across several comprehensive experiments, we find that both algorithms perform much better than the BASIC and Cumulate algorithms because they generate fewer candidate itemsets; in addition, the GMAR algorithm prunes a large number of irrelevant rules based on the minimum confidence.
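To make the problem setting concrete, the following is a minimal sketch of what a generalized association rule over a taxonomy looks like and how its support and confidence can be computed. It is not GMAR or GMFI: it counts support by scanning an ancestor-extended copy of the database, which is exactly the re-scanning cost the abstract says the proposed algorithms try to avoid. The taxonomy, the transactions, and the names `PARENT`, `TRANSACTIONS`, `ancestors`, `extend`, `support`, and `confidence` are all hypothetical and chosen only for illustration.

```python
# Hypothetical taxonomy, stored as child -> parent (not taken from the paper).
PARENT = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
    "shoes": "footwear",
    "hiking boots": "footwear",
}

# Hypothetical transaction database of leaf-level items.
TRANSACTIONS = [
    {"shirt"},
    {"jacket", "hiking boots"},
    {"ski pants", "hiking boots"},
    {"shoes"},
    {"shoes"},
    {"jacket"},
]


def ancestors(item):
    """Return every ancestor of `item` in the taxonomy."""
    result = set()
    while item in PARENT:
        item = PARENT[item]
        result.add(item)
    return result


def extend(transaction):
    """Return the transaction plus all ancestors of its items."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item)
    return extended


# Each transaction now also "contains" the higher-level categories.
EXTENDED = [extend(t) for t in TRANSACTIONS]


def support(itemset):
    """Fraction of extended transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in EXTENDED) / len(EXTENDED)


def confidence(antecedent, consequent):
    """Confidence of the rule antecedent -> consequent (0.0 if the antecedent never occurs)."""
    base = support(antecedent)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent)) / base


if __name__ == "__main__":
    print(support({"jacket", "hiking boots"}))
    print(support({"outerwear", "hiking boots"}))
    print(confidence({"outerwear"}, {"hiking boots"}))
```

In this toy data the leaf-level itemset {jacket, hiking boots} appears in one of six transactions, while the generalized itemset {outerwear, hiking boots} appears in two, so a rule that misses a minimum support threshold at the leaf level can pass it one level up. A minimum confidence threshold would then be applied to `confidence(...)` to discard weak rules.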

Similar resources

Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...

Maintenance of Generalized Association Rules for Record Deletion Based on the Pre-Large Concept

In the past, we proposed an incremental mining algorithm for maintenance of generalized association rules as new transactions were inserted. Deletion of records in databases is, however, commonly seen in real-world applications. In this paper, we thus attempt to extend our previous approach to solve this issue. The proposed algorithm maintains generalized association rules based on the concept ...

The fuzzy data mining generalized association rules for quantitative values

Due to the increasing use of very large databases and data warehouses, mining useful information and helpful knowledge from transactions is evolving into an important research area. Most conventional data-mining algorithms identify the relationships among transactions using binary values and find rules at a single concept level. Transactions with quantitative values and items with hierarchy rel...

A New Algorithm for Faster Mining of Generalized Association Rules

Generalized association rules are a very important extension of boolean association rules, but with current approaches mining generalized rules is computationally very expensive. Especially when considering the rule generation as being part of an interactive KDD-process this becomes annoying. In this paper we discuss strengths and weaknesses of known approaches to generate frequent itemsets. Ba...

A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining

Data envelopment analysis (DEA) is a relatively new data-oriented approach to evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively limited period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...

Journal:
  • Journal of Intelligent and Fuzzy Systems

Volume 22  Issue 

Pages  -

Publication date 2011